Almost Optimal Variance-Constrained Best Arm Identification

Abstract

We design and analyze Variance-Aware-Lower and Upper Confidence Bound (VA-LUCB), a parameter-free algorithm, for identifying the best arm under the fixed-confidence setup and under a stringent constraint that the variance of the chosen arm is strictly smaller than a given threshold. An upper bound on VA-LUCB’s sample complexity is shown to be characterized by a fundamental variance-aware hardness quantity $H_{\mathrm {VA}}$. By proving an information-theoretic lower bound, we show that the sample complexity of VA-LUCB is optimal up to a factor logarithmic in $H_{\mathrm {VA}}$. Extensive experiments corroborate the dependence of the sample complexity on the various terms in $H_{\mathrm {VA}}$. By comparing VA-LUCB’s empirical performance to that of a close competitor, RiskAverse-UCB-BAI by David et al. (2018), our experiments suggest that VA-LUCB has the lowest sample complexity for this class of risk-constrained best arm identification problems, especially for the riskiest instances.
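The identification target described in the abstract can be illustrated with a minimal sketch. It assumes the true means and variances are known, which is exactly what a bandit algorithm does not have; it is therefore only an oracle definition of the "best feasible arm", not VA-LUCB itself. The function name `best_feasible_arm` and the example numbers are hypothetical.

```python
from typing import Optional, Sequence


def best_feasible_arm(
    means: Sequence[float],
    variances: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Return the index of the highest-mean arm whose variance is
    strictly below `threshold`, or None if no arm is feasible.

    Oracle view only: it reads the true parameters directly, whereas a
    fixed-confidence algorithm must estimate them from samples.
    """
    # An arm is feasible only if its variance is strictly smaller than the threshold.
    feasible = [i for i, v in enumerate(variances) if v < threshold]
    if not feasible:
        return None
    # Among feasible arms, pick the one with the largest mean.
    return max(feasible, key=lambda i: means[i])


# Hypothetical instance: arm 0 has the largest mean but is too risky.
print(best_feasible_arm([0.9, 0.7, 0.5], [0.30, 0.10, 0.05], threshold=0.2))
```

In this made-up instance the highest-mean arm violates the variance constraint, so the target is arm 1 rather than arm 0.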

Related Resources

Optimal Best Arm Identification with Fixed Confidence

We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the ‘Track-and-Stop’ strategy, which we prove to be asymptotically optimal. It consists in a new sampling rule (which tracks the optimal proportions of arm draws highlighted by the lower bound) and in a stopping...

On the Optimal Sample Complexity for Best Arm Identification

We study the best arm identification (Best-1-Arm) problem, which is defined as follows. We are given $n$ stochastic bandit arms. The $i$th arm has a reward distribution $D_i$ with an unknown mean $\mu_i$. Upon each play of the $i$th arm, we can get a reward, sampled i.i.d. from $D_i$. We would like to identify the arm with largest mean with probability at least $1-\delta$, using as few samples as possible. We also st...

Towards Instance Optimal Bounds for Best Arm Identification

In the classical best arm identification (Best-1-Arm) problem, we are given n stochastic bandit arms, each associated with a reward distribution with an unknown mean. Upon each play of an arm, we can get a reward sampled i.i.d. from its reward distribution. We would like to identify the arm with the largest mean with probability at least 1 − δ, using as few samples as possible. The problem has ...

Multi-Bandit Best Arm Identification

We study the problem of identifying the best arm in each of the bandits in a multibandit multi-armed setting. We first propose an algorithm called Gap-based Exploration (GapE) that focuses on the arms whose mean is close to the mean of the best arm in the same bandit (i.e., small gap). We then introduce an algorithm, called GapE-V, which takes into account the variance of the arms in addition t...

Open Problem: Best Arm Identification: Almost Instance-Wise Optimality and the Gap Entropy Conjecture

The best arm identification problem (BEST-1-ARM) is the most basic pure exploration problem in stochastic multi-armed bandits. The problem has a long history and attracted significant attention for the last decade. However, we do not yet have a complete understanding of the optimal sample complexity of the problem: The state-of-the-art algorithms achieve a sample complexity of $O(\sum_{i=2}^{n} \Delta_i^{-2}$ ...

Journal

Journal title: IEEE Transactions on Information Theory

Year: 2023

ISSN: 0018-9448, 1557-9654

DOI: https://doi.org/10.1109/tit.2022.3222231